Abstract

The class of weakly acyclic games, which includes potential games and 
dominance-solvable games, captures many practical application domains. 
In a weakly acyclic game, from any starting state, there is a sequence of 
better-response moves that leads to a pure Nash equilibrium; informally, 
these are games in which natural distributed dynamics, such as better- 
response dynamics, cannot enter inescapable oscillations. We establish a 
novel link between such games and the existence of pure Nash equilibria 
in subgames. Specifically, we show that the existence of a unique pure 
Nash equilibrium in every subgame implies the weak acyclicity of a game. 
In contrast, the possible existence of multiple pure Nash equilibria in 
every subgame is insufficient for weak acyclicity in general; here, we also 
systematically identify the special cases (in terms of the number of players 
and strategies) for which this is sufficient to guarantee weak acyclicity. 



1 Introduction 

In many domains, convergence to a pure Nash equilibrium is a fundamental 
problem. In many engineered agent-driven systems that fare best when steady 
at a pure Nash equilibrium, convergence to equilibrium is expected [HUH] to 
happen via better-response (or best-response) dynamics: Start at some strategy 
profile. Players take turns, in some arbitrary order, with each player making 
a better response (best response) to the strategies of the other players, i.e., 
choosing a strategy that increases (maximizes) their utility, given the current 
strategies of the other players. Repeat this process until no player wants to 
switch to a different strategy, at which point we reach a pure Nash equilibrium. 
For better-response dynamics to converge to a pure Nash equilibrium regardless
of the initial strategy profile, a necessary condition is that, from every
strategy profile, there exist some better-response improvement path (that is, 
a sequence of players' better responses) leading from that strategy profile to a 
pure Nash equilibrium. Games for which this property holds are called "weakly 
acyclic games" [121119^1 . Both potential games [T31[T7] and dominance-solvable 
games [15] are special cases of weakly acyclic games. 
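
The dynamics described above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the formal development: the dictionary-based game representation, the function names, and the 2 x 2 coordination game used as input are all assumptions made for the example.

```python
import random

def better_responses(payoffs, strategies, profile, i):
    """Strategies of player i that strictly improve i's payoff at `profile`."""
    current = payoffs[profile][i]
    result = []
    for s in strategies[i]:
        candidate = profile[:i] + (s,) + profile[i + 1:]
        if payoffs[candidate][i] > current:
            result.append(s)
    return result

def is_pure_nash(payoffs, strategies, profile):
    """A profile is a pure Nash equilibrium if no player has a better response."""
    return all(not better_responses(payoffs, strategies, profile, i)
               for i in range(len(strategies)))

def better_response_dynamics(payoffs, strategies, start, rng, max_steps=1000):
    """Activate random players until nobody wants to move (or max_steps is hit)."""
    profile = start
    for _ in range(max_steps):
        if is_pure_nash(payoffs, strategies, profile):
            return profile
        i = rng.randrange(len(strategies))
        options = better_responses(payoffs, strategies, profile, i)
        if options:
            s = rng.choice(options)
            profile = profile[:i] + (s,) + profile[i + 1:]
    return None  # no equilibrium reached within the step budget

# Hypothetical 2 x 2 coordination game (a potential game): both players
# prefer to match, and matching on "b" pays more than matching on "a".
strategies = [("a", "b"), ("a", "b")]
payoffs = {
    ("a", "a"): (1, 1), ("a", "b"): (0, 0),
    ("b", "a"): (0, 0), ("b", "b"): (2, 2),
}
final = better_response_dynamics(payoffs, strategies, ("a", "b"), random.Random(0))
print(final)
```

Which of the two equilibria is reached depends on which player happens to be activated first; in a potential game such as this one, every run ends at some pure Nash equilibrium.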

In a game that is not weakly acyclic, there is at least one starting state 
from which the game is guaranteed to oscillate indefinitely under better-/best- 
response dynamics. Moreover, the weak acyclicity of a game implies that natural 
decentralized dynamics (e.g., randomized better-/best-response, or no-regret 
dynamics) are stochastically guaranteed to reach a pure Nash equilibrium
[11, 19]. Thus, weakly acyclic games capture the possibility of reaching pure Nash
equilibria via simple, local, globally-asynchronous interactions between strategic 
agents, independently of the starting state. We assert this is the realistic notion 
of "convergence" in most distributed systems. 

1.1 A Motivating Example 

We now look at an example inspired by interdomain routing that has this natural 
form of convergence despite it being, formally, possible that the network will 
never converge. In keeping with results that we study here, we consider best- 
response dynamics of a routing model in which each node can see each other 
node's current strategy, i.e., its "next hop" (the node to which it forwards 
its data en route to the destination), as contrasted with models where nodes
depend on path announcements to learn this information. (Levin et al. [9]
formalized routing dynamics in which nodes learn about forwarding through 
path announcements.) 




Figure 1: Instance of the interdomain routing game that is weakly acyclic and 
has a best-response cycle. 

Consider the network on four nodes shown in Fig. 1. Each of the nodes 1,
2, and 3 is trying to get a path for network traffic to the destination node d.
A strategy of a node i is a choice of a neighbor to whom i will forward traffic;
the strategy space of node i, $S_i$, is its neighborhood in the graph. The utility
of the destination d is independent of the outcome, and the utility $u_i$ of node
$i \neq d$ depends only on the path that i's traffic takes to the destination (and is
$-\infty$ if there is no path). We only need to consider the relationships between the
values of $u_i$ on all possible paths; the actual values of the utilities do not make a
difference. Using 132d to denote the path from 1 to 3 to 2 to d, and similarly for
other paths, here we assume the following: $u_1(132d) > u_1(1d) > u_1(13d) > -\infty$;
$u_2(213d) > u_2(2d) > u_2(21d) > -\infty$; $u_3(321d) > u_3(3d) > u_3(32d) > -\infty$; and
$u_i(P) = -\infty$ for all other paths P, e.g., $u_1(12d) = -\infty$. These preferences
are indicated by the lists of paths in order of decreasing preference next to the
nodes in Fig. 1.

1 In some of the economics literature, the terms "weak finite-improvement path property"
(weak FIP) and "weak finite best-response path property" (weak FBRP) are also used, for
weak acyclicity under better- and best-response dynamics, respectively.

The unique pure Nash equilibrium in the game in Fig. 1 is (d,d,d), and,
ideally, the dynamics would always converge to it. However, there exists a
best-response cycle in this game, as shown in Fig. 2. Here, each triple lists the paths
that nodes 1, 2, and 3 get; the nodes' strategies correspond to the second node
in their respective paths. The node above the arrow between two triples is the
one that makes a best response to get from one triple to the next.



node 1:    1d     132d   13d    13d    1d     1d     1d
node 2:    2d     2d     2d     213d   21d    21d    2d
node 3:    32d    32d    3d     3d     3d     321d   32d
activated:     1      3      2      1      3      2

Figure 2: A best-response cycle for the game in Fig. 1.

Once the network is in one of these states,^2 there is a fair activation sequence
(i.e., in which every node is activated infinitely often) such that each activated 
node best responds to the then-current choices of the other nodes and such that 
the network never converges to a stable routing tree (a pure Nash equilibrium). 

Although this cycle seems to suggest that the network in Fig. 1 would be
operationally troublesome, it is not as problematic as we might fear. From 
every point in the state space, there is a sequence of best-response moves that 
leads to the unique pure Nash equilibrium. We may see this by inspection in 
this case, but this example also satisfies the hypotheses of our main theorem 
below. So long as each node has some positive probability of being the next 
activated node, then, with probability 1, the network will eventually converge 
to the unique stable routing tree, regardless of the initial configuration of the 
network. 
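
The claims made about this example can be checked mechanically. The sketch below encodes the routing game under two assumptions that the text does not state explicitly: that each of the nodes 1, 2, and 3 may forward to either of the other two nodes or directly to d, and that the preference order can be represented by integer ranks (any order-preserving values would do). All function names are ours.

```python
from itertools import product

NODES = (1, 2, 3)
# Assumption: every node may forward to either other node or to d directly.
NEXT_HOPS = {1: (2, 3, "d"), 2: (1, 3, "d"), 3: (1, 2, "d")}
# Permitted paths, most preferred first; any other outcome is worth -infinity.
PREFS = {
    1: ["132d", "1d", "13d"],
    2: ["213d", "2d", "21d"],
    3: ["321d", "3d", "32d"],
}

def path(profile, i):
    """Follow next hops from node i; return the path string, or None on a loop."""
    hops, node = str(i), i
    while node != "d":
        node = profile[node - 1]
        if str(node) in hops:
            return None
        hops += str(node)
    return hops

def utility(profile, i):
    """Integer rank of node i's path (3 is best), or -infinity if not permitted."""
    p = path(profile, i)
    if p in PREFS[i]:
        return len(PREFS[i]) - PREFS[i].index(p)
    return float("-inf")

def deviate(profile, i, hop):
    return profile[:i - 1] + (hop,) + profile[i:]

def best_response_moves(profile):
    """Profiles reachable by one strictly improving best response."""
    moves = []
    for i in NODES:
        values = {h: utility(deviate(profile, i, h), i) for h in NEXT_HOPS[i]}
        best = max(values.values())
        if best > utility(profile, i):
            moves += [deviate(profile, i, h) for h, v in values.items() if v == best]
    return moves

def reachable(start):
    """All profiles reachable from `start` by best-response moves."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in best_response_moves(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

profiles = list(product(*(NEXT_HOPS[i] for i in NODES)))
equilibria = [p for p in profiles if not best_response_moves(p)]
# The cycle of Fig. 2, written as next-hop triples for nodes 1, 2, 3.
cycle = [("d", "d", 2), (3, "d", 2), (3, "d", "d"), (3, 1, "d"),
         ("d", 1, "d"), ("d", 1, 2), ("d", "d", 2)]
cycle_ok = all(b in best_response_moves(a) for a, b in zip(cycle, cycle[1:]))
converges = all(("d", "d", "d") in reachable(p) for p in profiles)
print(equilibria, cycle_ok, converges)
```

Under these assumptions, the enumeration finds (d,d,d) as the only equilibrium, confirms that each step of the cycle in Fig. 2 is a best response, and confirms that the equilibrium is reachable by best responses from all 27 profiles.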

2 For example, this might happen if the link between 2 and d temporarily fails. 2 would
always choose to send traffic to 1 (if anywhere); 1 would eventually converge to sending traffic
directly to d (with 2 sending its traffic to 1), and 3 would then be able to send its traffic along
321d. Once the failed link between 2 and d is restored, 2's best response to the choices of the
other nodes is to send its traffic directly to d, resulting in the first configuration of the cycle
above.
1.2 Our Results



Weak acyclicity is connected to the study of the computational properties of sink
equilibria [3, 5], minimal collections of states from which best-response dynamics
cannot escape: a game is weakly acyclic if and only if all sinks are "singletons",
that is, pure Nash equilibria. Unfortunately, Mirrokni and Skopalik [13] found
that reliably checking weak acyclicity is extremely computationally intractable
in the worst case (PSPACE-complete) even in succinctly-described games. This
means, inter alia, that not only can we not hope to consistently check games in 
these categories for weak acyclicity, but we cannot even hope to have general 
short "proofs" of weak acyclicity, which, once somehow found, could be tractably 
checked. 

With little hope of finding robust, effective ways to consistently check weak 
acyclicity, we instead set out to find sufficient conditions for weak acyclicity: 
finding usable properties that imply weak acyclicity may yield better insights 
into at least some cases where we need weak acyclicity for the application. 

In this work, we focus on general normal-form games. Potential games, the 
much better understood subcategory of weakly acyclic games, are known to have 
the following property, which we will refer to as subgame stability, abbreviated 
SS: not only does a pure Nash equilibrium exist in the game, but a pure Nash 
equilibrium exists in each of its subgames, i.e., in each game obtained from the 
original game by the removal of players' strategies. Subgame stability is a useful 
property in many contexts. For example, in network routing games, subgame 
stability corresponds to the important requirement that there be a stable routing 
state even in the presence of arbitrary network malfunctions [6]. We ask the 
following natural question: When is the strong property of subgame stability 
sufficient for weak acyclicity? 

Yamamori and Takahashi [18] prove the following two results:^3

Theorem: [18] In 2-player games, subgame stability implies weak acyclicity, 
even under best response. 

Theorem: [18] There exist 3x3x3 games for which subgame stability holds 
that are not weakly acyclic under best response. 

Thus, subgame stability is sufficient for weak acyclicity in 2-player games, 
yet is not always sufficient for weak acyclicity in games with n > 2 players. 
Our goal in this work is to (1) identify sufficient conditions for weak acyclicity 
in the general n-player case; and (2) pursue a detailed characterization of the 
boundary between games for which subgame stability does imply weak acyclicity 
and games for which it does not. 

Our main result for n-player games shows that a constraint stronger than SS, 
that we term "unique subgame stability" (USS), is sufficient for weak acyclicity: 

Theorem: If every subgame of a game $\Gamma$ has a unique pure Nash equilibrium,
then $\Gamma$ is weakly acyclic, even under best response.

3 Yamamori and Takahashi use the terms quasi- acyclicity for weak acyclicity under best 
response, and Pure Nash Equilibrium Property (PNEP) for subgame stability. 
Table 1: Results summary: The impact of USS/SSS/SS on weak acyclicity
(with M > 2): ✓ marks classes with guaranteed weak acyclicity, even under
best response; X marks classes with examples that are not weakly acyclic even
under better response. *: only for strict games

This result casts an interesting contrast against the negative result in [18]:
unique equilibria in subgames guarantee weak acyclicity, but the existence of 
more pure Nash equilibria in subgames can lead to violations of weak acyclicity. 
Hence, perhaps counter-intuitively, too many stable states can potentially result 
in persistent instability of local dynamics. (A similar phenomenon is seen in 
recent work of Jaggard et al. [7], which studied settings in which multiple stable
states preclude the possibility of a non-probabilistic guarantee of convergence.) 

We consider SS games, USS games, and also the class of strict and subgame 
stable games SSS, i.e., subgame stable games which have no ties in the utility 
functions. We observe that these three classes of games form the hierarchy 
$\mathrm{USS} \subset \mathrm{SSS} \subset \mathrm{SS}$. We examine the number of players, number of strategies,
and the strictness of the game (the constraint that there are no ties in the
utility function), and give a complete characterization of the weak acyclicity
implications of each of these. Our contributions are summarized in Table 1.

1.3 Other Related Work 

Weak acyclicity has been specifically addressed in a handful of specially-structured
games: in an applied setting, BGP with backup routing pQ; in a game-theoretical
setting, games with "strategic complementarities" [U[5] (a supermodularity
condition on lattice-structured strategy sets); and in an algorithmic setting,
several kinds of succinct games [13]. Milchtaich [12] studied Rosenthal's
congestion games [17] and proved that, in interesting cases, such games
are weakly acyclic even if the payoff functions (utilities) are not universal but
player-specific. Marden et al. [10] formulated the cooperative-control-theoretic
consensus problem as a potential game (implying that it is weakly acyclic); they
also defined and investigated a time-varying version of weak acyclicity.

1.4 Outline of Paper 

In the following, we recall the relevant concepts and definitions in Section 2,
present our sufficient condition for weak acyclicity in Section 3, and give our
characterization of weak acyclicity implications in Section 4.
2 Weakly acyclic games and subgame stability



We use standard game-theoretic notation. Let $\Gamma$ be a normal-form game with
n players $1, \ldots, n$. We denote by $S_i$ the strategy space of the $i$th player.
Let $S = S_1 \times \cdots \times S_n$, and let $S_{-i} = S_1 \times \cdots \times S_{i-1} \times S_{i+1} \times \cdots \times S_n$ be
the cartesian product of all strategy spaces but $S_i$. Each player i has a utility
function $u_i$ that specifies i's payoff in every strategy profile of the players. For
each strategy $s_i \in S_i$ and every $(n-1)$-tuple of strategies $s_{-i} \in S_{-i}$, we denote
by $u_i(s_i, s_{-i})$ the utility of the strategy profile in which player i plays $s_i$ and
all other players play their strategies in $s_{-i}$. We will make use of the following
definitions.

Definition 1 (better-response strategies). A strategy $s_i' \in S_i$ is a better
response of player i to a strategy profile $(s_i, s_{-i})$ if $u_i(s_i', s_{-i}) > u_i(s_i, s_{-i})$.

Definition 2 (best-response strategies). A strategy $s_i \in S_i$ is a best response of
player i to a strategy profile $s_{-i} \in S_{-i}$ of the other players if $s_i \in \arg\max_{s_i' \in S_i} u_i(s_i', s_{-i})$.

Definition 3 (pure Nash equilibria). A strategy profile s is a pure Nash
equilibrium if, for every player i, $s_i$ is a best response of i to $s_{-i}$.

Definition 4 (better- and best-response improvement paths). A better-response
(best-response) improvement path in a game $\Gamma$ is a sequence of strategy profiles
$s^1, \ldots, s^k$ such that for every $j \in [k-1]$ (1) $s^j$ and $s^{j+1}$ only differ in the strategy
of a single player i and (2) i's strategy in $s^{j+1}$ is a better response to $s^j$ (best
response to $s^j_{-i}$ and $u_i(s^{j+1}_i, s^j_{-i}) > u_i(s^j_i, s^j_{-i})$). The better-response dynamics
(best-response dynamics) graph for $\Gamma$ is the graph on the strategy profiles in
$\Gamma$ whose edges are the better-response (best-response) improvement paths of
length 1.

We will use $AR_\Gamma(s)$ and $BR_\Gamma(s)$ to denote the set of all states reachable by,
respectively, better and best responses when starting from s in $\Gamma$.
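
To make the two dynamics graphs and the two reachability sets concrete, here is a small sketch. The representation (a dictionary from profiles to payoff tuples), the function names, and the one-player example are assumptions for illustration; we also assume that the starting profile s itself counts as reachable.

```python
def improving_moves(payoffs, strategies, profile, best_only):
    """Edges of the better-response (or best-response) dynamics graph at `profile`."""
    edges = []
    for i, options in enumerate(strategies):
        def value(s):
            # Payoff of player i when i switches to s and the others stay put.
            return payoffs[profile[:i] + (s,) + profile[i + 1:]][i]
        current = value(profile[i])
        top = max(value(s) for s in options)
        for s in options:
            if value(s) > current and (not best_only or value(s) == top):
                edges.append(profile[:i] + (s,) + profile[i + 1:])
    return edges

def reachable(payoffs, strategies, start, best_only):
    """AR(start) when best_only is False, BR(start) when it is True."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in improving_moves(payoffs, strategies, stack.pop(), best_only):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# Hypothetical one-player game with three strategies of increasing payoff.
strategies = [("low", "mid", "high")]
payoffs = {("low",): (0,), ("mid",): (1,), ("high",): (2,)}
ar = reachable(payoffs, strategies, ("low",), best_only=False)
br = reachable(payoffs, strategies, ("low",), best_only=True)
print(sorted(ar), sorted(br))
```

In this toy game the better-response set contains all three profiles, while the best-response set skips the intermediate strategy, which illustrates that the best-response reachable set is always contained in the better-response one.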

We are now ready to define weakly acyclic games [TS]. Informally, a game 
is weakly acyclic if a pure Nash equilibrium can be reached from any initial 
strategy profile via a better-response improvement path. 

Definition 5 (weakly acyclic games). A game $\Gamma$ is weakly acyclic if, from every
strategy profile s, there is a better-response improvement path $s^1, \ldots, s^k$ such
that $s^1 = s$ and $s^k$ is a pure Nash equilibrium in $\Gamma$. (I.e., for each s, there is a
pure Nash equilibrium in $AR_\Gamma(s)$.)

We also coin a parallel definition based on best-response dynamics. 

Definition 6 (weak acyclicity under best response). A game $\Gamma$ is weakly acyclic
under best response if, from every strategy profile s, there is a best-response
improvement path $s^1, \ldots, s^k$ such that $s^1 = s$ and $s^k$ is a pure Nash equilibrium
in $\Gamma$. (I.e., for each s, there is a pure Nash equilibrium in $BR_\Gamma(s)$.)
Weak acyclicity of either kind is equivalent to requiring that, under the
respective dynamics, the game has no "non-trivial" sink equilibria [3, 5], i.e.,
sink equilibria containing more than one strategy profile. Conventionally, sink 
equilibria are defined with respect to best-response dynamics, but the original 
definition by Goemans et al. [5] takes into account better-response dynamics as 
well. 
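
The sink characterization suggests a direct, if brute-force, check for small games: compute the sinks of the best-response dynamics graph and test whether all of them are singletons. The following sketch assumes a dictionary-based payoff table and uses names of our own choosing; it is meant only for games small enough to enumerate.

```python
from itertools import product

def best_response_moves(payoffs, strategies, profile):
    """Profiles reachable from `profile` by one strictly improving best response."""
    moves = []
    for i, options in enumerate(strategies):
        values = {s: payoffs[profile[:i] + (s,) + profile[i + 1:]][i] for s in options}
        top = max(values.values())
        if top > values[profile[i]]:
            moves += [profile[:i] + (s,) + profile[i + 1:]
                      for s in options if values[s] == top]
    return moves

def reachable(payoffs, strategies, start):
    seen, stack = {start}, [start]
    while stack:
        for nxt in best_response_moves(payoffs, strategies, stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def sinks(payoffs, strategies):
    """Strongly connected sets of profiles that best-response dynamics cannot leave."""
    reach = {p: reachable(payoffs, strategies, p) for p in product(*strategies)}
    found = set()
    for p, r in reach.items():
        # p lies in a sink exactly when every profile reachable from p can return to p.
        if all(p in reach[q] for q in r):
            found.add(frozenset(r))
    return found

def weakly_acyclic_under_best_response(payoffs, strategies):
    return all(len(sink) == 1 for sink in sinks(payoffs, strategies))

two = [("H", "T"), ("H", "T")]
# Matching pennies: the row player wants to match, the column player to mismatch.
pennies = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
           ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}
# A coordination game with two pure Nash equilibria.
coordination = {("H", "H"): (1, 1), ("H", "T"): (0, 0),
                ("T", "H"): (0, 0), ("T", "T"): (2, 2)}
print(sinks(pennies, two))
print(sinks(coordination, two))
```

Matching pennies has a single sink containing all four profiles, so it is not weakly acyclic; the coordination game has two singleton sinks, one per equilibrium.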

The following follows easily from definitions: 

Claim. If a game is weakly acyclic under best response, then it is weakly acyclic.

Proof. If $\Gamma$ is weakly acyclic under best response, the paths to equilibrium from
each state will still be there if we augment the state space with additional
better-response transitions. On the other hand, the game in Figure 3, mentioned in,
e.g., [11], is weakly acyclic, but it is not weakly acyclic under best response. □





[Payoff matrix over the strategies H, T, and X for each of the two players.]

Figure 3: Matching pennies with a "better-response" escape route (dashed arrows),
but a best-response persistent cycle (solid arrows) [11].
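
The gap between the two notions can be reproduced on a small example. The payoffs below are hypothetical, chosen by us in the spirit of Figure 3 (matching pennies on H and T, plus a third strategy X that offers a better-response, but not best-response, way out); they should not be read as the exact values of the figure.

```python
from itertools import product

STRATEGIES = [("H", "T", "X"), ("H", "T", "X")]
# Hypothetical payoffs (row, column); every unlisted profile pays (0, 0).
PAYOFFS = {p: (0, 0) for p in product(*STRATEGIES)}
PAYOFFS.update({
    ("H", "H"): (2, 0), ("H", "T"): (0, 2),
    ("T", "H"): (0, 2), ("T", "T"): (2, 0),
    ("X", "T"): (1, 0),   # better, but not best, response for the row player at (H, T)
    ("X", "X"): (3, 3),   # the unique pure Nash equilibrium
})

def moves(profile, best_only):
    """Outgoing edges in the better-response (or best-response) dynamics graph."""
    out = []
    for i, opts in enumerate(STRATEGIES):
        alts = [profile[:i] + (s,) + profile[i + 1:] for s in opts]
        top = max(PAYOFFS[q][i] for q in alts)
        for q in alts:
            improves = PAYOFFS[q][i] > PAYOFFS[profile][i]
            if improves and (not best_only or PAYOFFS[q][i] == top):
                out.append(q)
    return out

def weakly_acyclic(best_only):
    """True if an equilibrium is reachable from every profile under the chosen dynamics."""
    for start in product(*STRATEGIES):
        seen, stack = {start}, [start]
        while stack:
            for nxt in moves(stack.pop(), best_only):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        # A profile with no better response at all is a pure Nash equilibrium.
        if not any(not moves(p, False) for p in seen):
            return False
    return True

print(weakly_acyclic(best_only=False))  # better-response dynamics
print(weakly_acyclic(best_only=True))   # best-response dynamics
```

With these assumed payoffs, the four matching-pennies profiles form a closed best-response cycle, while the row player's move from (H, T) to (X, T) gives better-response dynamics a path to the equilibrium (X, X).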

Curiously, all of our results apply both to weak acyclicity in its conventional 
better-response sense and to weak acyclicity under best response. Thus, unlike 
weak acyclicity itself, the conditions presented in this paper are "agnostic" to 
the better-/best-response distinction (like the notion of pure Nash equilibria 
itself). 

We now present the notion of subgame stability. 

Definition 7 (subgames). A subgame of a game $\Gamma$ is a game $\Gamma'$ obtained from
$\Gamma$ via the removal of players' strategies.

Definition 8 (subgame stability). Subgame stability is said to hold for a game
$\Gamma$ if every subgame of $\Gamma$ has a pure Nash equilibrium. We use SS to denote the
class of subgame stable games.

Definition 9 (unique subgame stability). Unique subgame stability is said to
hold for a game $\Gamma$ if every subgame of $\Gamma$ has a unique pure Nash equilibrium.
We use USS to denote the class of such games.
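
For small games, Definitions 7 through 9 can be checked by enumeration. The sketch below is illustrative only: it assumes that the game itself counts as one of its subgames (no strategies removed), that every player keeps at least one strategy, and a dictionary-based payoff table; the function names and the three example games are ours.

```python
from itertools import combinations, product

def nonempty_subsets(options):
    return [c for r in range(1, len(options) + 1) for c in combinations(options, r)]

def subgames(strategies):
    """All ways of keeping a non-empty subset of each player's strategies."""
    return product(*(nonempty_subsets(opts) for opts in strategies))

def pure_nash_equilibria(payoffs, strategies):
    """Profiles of the (sub)game in which no player can strictly gain by deviating."""
    found = []
    for profile in product(*strategies):
        stable = all(
            payoffs[profile][i] >= payoffs[profile[:i] + (s,) + profile[i + 1:]][i]
            for i, opts in enumerate(strategies) for s in opts)
        if stable:
            found.append(profile)
    return found

def subgame_stable(payoffs, strategies):
    return all(len(pure_nash_equilibria(payoffs, sub)) >= 1
               for sub in subgames(strategies))

def unique_subgame_stable(payoffs, strategies):
    return all(len(pure_nash_equilibria(payoffs, sub)) == 1
               for sub in subgames(strategies))

two = [("C", "D"), ("C", "D")]
dilemma = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
           ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
coordination = {("C", "C"): (1, 1), ("C", "D"): (0, 0),
                ("D", "C"): (0, 0), ("D", "D"): (2, 2)}
pennies = {("C", "C"): (1, -1), ("C", "D"): (-1, 1),
           ("D", "C"): (-1, 1), ("D", "D"): (1, -1)}
for name, game in [("dilemma", dilemma), ("coordination", coordination),
                   ("pennies", pennies)]:
    print(name, subgame_stable(game, two), unique_subgame_stable(game, two))
```

The prisoner's dilemma is in USS, the coordination game is in SS but not USS (its full game has two equilibria), and matching pennies is not even in SS. The number of subgames grows exponentially with the number of strategies, so this check is practical only for very small games.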

We will also consider games in which no player has two or more equally good 
responses to any fixed set of strategies played by the other players. Following, 
e.g., [16], we define strict games as follows.

Definition 10 (strict game). A game $\Gamma$ is strict if, for any two distinct strategy
profiles $s = (s_1, \ldots, s_n)$ and $s' = (s_1', \ldots, s_n')$ such that there is some $j \in [n]$ for
which $s' = (s_j', s_{-j})$ (i.e., s and s' differ only in j's strategy), we have $u_j(s) \neq u_j(s')$.






Definition 11 (SSS). We use SSS to denote the class of games that are both 
strict and subgame stable. 

It's easy to connect unique subgame stability and strictness. To do so, we 
use the next definition, which will also play a role in our main proofs. 

Definition 12 (subgame spanned by profiles). For a game $\Gamma$ with n players and
profiles $s^1, \ldots, s^k$ in $\Gamma$, the subgame spanned by $s^1, \ldots, s^k$ is the subgame $\Gamma'$ of
$\Gamma$ in which the strategy space for player i is $S_i' = \{ s_i^j \mid 1 \le j \le k \}$.

Claim. The categories USS, SSS, and SS form a hierarchy: $\mathrm{USS} \subset \mathrm{SSS} \subset \mathrm{SS}$.

Proof. $\mathrm{SSS} \subset \mathrm{SS}$ by definition. To see that $\mathrm{USS} \subset \mathrm{SSS}$, observe the following.
If a game is not strict, there are $s_j, s_j' \in S_j$ and $s_{-j}$ such that
$u_j(s_j, s_{-j}) = u_j(s_j', s_{-j})$. Both strategy profiles in the subgame spanned by $(s_j, s_{-j})$ and
$(s_j', s_{-j})$ are pure Nash equilibria, violating unique subgame stability. □

3 Sufficient condition for weak acyclicity with n 
players 

When is weak acyclicity guaranteed in n-player games for $n \ge 3$? We prove
that the existence of a unique pure Nash equilibrium in every subgame implies
weak acyclicity. We note that this is not true when subgames can contain
multiple pure Nash equilibria [18]. Thus, while at first glance, introducing
extra equilibria might seem like it would make it harder to get "stuck" in a
non-trivial component of the state space with no "escape path" to an equilibrium,
this intuition is false; allowing extra pure Nash equilibria in subgames actually
enables the existence of non-trivial sinks.

Theorem 3.1. Every game $\Gamma$ that has a unique pure Nash equilibrium in
every subgame $\Gamma' \subseteq \Gamma$ is weakly acyclic under best response (as are all of its
subgames).

The proof of this theorem uses the following technical lemma: 

Lemma 3.2. If s is a strategy profile in $\Gamma$, and $\Gamma'$ is the subgame of $\Gamma$ spanned
by $BR_\Gamma(s)$, then any best-response improvement path $s, s^1, \ldots, s^k$ in $\Gamma'$ that
starts at s is also a best-response improvement path in $\Gamma$.

Proof. We proceed by induction on the length of the path. The base case is
tautological. Inductively, assume $s, \ldots, s^l$ is a best-response improvement path
in $\Gamma$. The strategy $s^{l+1}_j$ is a best response to $s^l_{-j}$ in $\Gamma'$ by some player j. This
guarantees that $s^l_j$ is not a best response by j to $s^l_{-j}$ in $\Gamma'$, let alone in $\Gamma$, so
$\Gamma' \supseteq BR_\Gamma(s) \supseteq BR_\Gamma(s^l)$ must contain a best response $s^*_j$ to $s^l_{-j}$ in $\Gamma$, and because
$s^{l+1}_j$ is a best response in $\Gamma'$, we are guaranteed that $u_j(s^*_j, s^l_{-j}) = u_j(s^{l+1})$, so
$s^{l+1}$ must be a best response in $\Gamma$. □

We may now prove Theorem 3.1.
Proof of Theorem 3.1. To prove Theorem 3.1, assume that $\Gamma$ is a game satisfying
the hypotheses of the theorem, and for a subgame $\Delta \subseteq \Gamma$, denote by $s_\Delta$
the unique pure Nash equilibrium in $\Delta$. We will proceed by induction up the
semilattice of subgames of $\Gamma$. The base cases are trivial: any $1 \times \cdots \times 1$ subgame
is weakly acyclic for lack of any transitions. Suppose that for some subgame $\Gamma'$
of game $\Gamma$ we know that every strict subgame $\Gamma'' \subsetneq \Gamma'$ is weakly acyclic.

Suppose that $\Gamma'$ is not weakly acyclic: it has a state s from which its unique
pure Nash equilibrium $s_{\Gamma'}$ cannot be reached by best responses. Let $\Gamma''$ be the
game spanned by $BR_{\Gamma'}(s)$. Consider the cases of (i) $s_{\Gamma'} \in \Gamma''$ and (ii) $s_{\Gamma'} \notin \Gamma''$:

Case (i): $s_{\Gamma'} \in \Gamma''$. This requires that, for an arbitrary player j with more
than 1 strategy in $\Gamma'$, there be a best-response improvement path from s to some
profile $\hat{s}$ where j plays the same strategy as it does in $s_{\Gamma'}$. Take one such j, and
let $\Gamma^j$ be the subgame of $\Gamma'$ where j is restricted to playing $\hat{s}_j$ only. Because
$s_{\Gamma'}$ is in $\Gamma^j$, the inductive hypothesis guarantees a best-response improvement
path in $\Gamma^j$ from $\hat{s}$ to $s_{\Gamma'}$. By construction, that path must only involve best
responses by players other than j, who have the same strategy options in $\Gamma^j$
as they did in $\Gamma'$, so that path is also a best-response improvement path in $\Gamma'$,
assuring a best-response improvement path in $\Gamma'$ from s to $s_{\Gamma'}$ via $\hat{s}$.

Case (ii): $s_{\Gamma'} \notin \Gamma''$. Then, the unique pure equilibrium $s_{\Gamma''}$ of $\Gamma''$ must be distinct
from $s_{\Gamma'}$. Because $s_{\Gamma'}$ is the only pure equilibrium in $\Gamma'$, $s_{\Gamma''}$ must have an
outgoing best-response edge to some profile $\hat{s}$ in $\Gamma'$. But the inductive hypothesis
ensures that $s_{\Gamma''} \in BR_{\Gamma''}(s)$; by Lemma 3.2, $s_{\Gamma''} \in BR_{\Gamma'}(s)$, which then ensures
that $\hat{s}$ must also be in $BR_{\Gamma'}(s)$, and hence in $\Gamma''$, so $s_{\Gamma''}$ isn't an equilibrium in
$\Gamma''$. □
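
Theorem 3.1 can be sanity-checked empirically on small games. The sketch below samples random strict 2 x 2 x 2 games, keeps those satisfying unique subgame stability, and verifies weak acyclicity under best response by exhaustive search. The sampling scheme, the seed, and all names are assumptions made for illustration; such a check can expose a counterexample but of course proves nothing.

```python
import random
from itertools import combinations, product

def deviations(profile, i, options):
    return [profile[:i] + (s,) + profile[i + 1:] for s in options]

def is_equilibrium(payoffs, strategies, profile):
    return all(payoffs[profile][i] >= payoffs[q][i]
               for i, opts in enumerate(strategies)
               for q in deviations(profile, i, opts))

def has_uss(payoffs, strategies):
    """True if every subgame (including the game itself) has exactly one equilibrium."""
    subsets = [[c for r in range(1, len(o) + 1) for c in combinations(o, r)]
               for o in strategies]
    return all(sum(is_equilibrium(payoffs, sub, p) for p in product(*sub)) == 1
               for sub in product(*subsets))

def best_response_moves(payoffs, strategies, profile):
    moves = []
    for i, opts in enumerate(strategies):
        top = max(payoffs[q][i] for q in deviations(profile, i, opts))
        if top > payoffs[profile][i]:
            moves += [q for q in deviations(profile, i, opts) if payoffs[q][i] == top]
    return moves

def weakly_acyclic_under_best_response(payoffs, strategies):
    for start in product(*strategies):
        seen, stack = {start}, [start]
        while stack:
            for nxt in best_response_moves(payoffs, strategies, stack.pop()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if not any(is_equilibrium(payoffs, strategies, p) for p in seen):
            return False
    return True

def random_strict_game(rng, strategies):
    """Each player's payoffs are a random permutation, so no ties can occur."""
    profiles = list(product(*strategies))
    columns = [rng.sample(range(len(profiles)), len(profiles)) for _ in strategies]
    return {p: tuple(col[k] for col in columns) for k, p in enumerate(profiles)}

strategies = [(0, 1), (0, 1), (0, 1)]
# A hand-made USS game: each player's payoff equals its own strategy index.
dominant = {p: p for p in product(*strategies)}
rng = random.Random(2024)
checked = violations = 0
for payoffs in [dominant] + [random_strict_game(rng, strategies) for _ in range(2000)]:
    if has_uss(payoffs, strategies):
        checked += 1
        violations += not weakly_acyclic_under_best_response(payoffs, strategies)
print(checked, violations)
```

The hand-made game in which every player has a strictly dominant strategy guarantees that at least one USS game is examined; by Theorem 3.1, the violation count should be zero regardless of how many sampled games pass the USS filter.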



4 Characterizing the implications of subgame 
stability 

Yamamori and Takahashi [18] established that in 2-player games, subgame stability
implies weak acyclicity, even under best response, yet this is not true in
3x3x3 games. We now present a complete characterization of when subgame stability
is sufficient for weak acyclicity, as a function of game size and strictness.
Our next result shows that the two-player theorem of [18] is maximal:

Theorem 4.1. Subgame stability is not sufficient for weak acyclicity even in 
non-strict 2x2x2 games. 

Proof. The non-strict 2x2x2 game in Fig. 4 satisfies subgame stability but
is not weakly acyclic: the states other than $(a_0, b_0, c_0)$ and $(a_1, b_1, c_1)$ form a
non-trivial sink. □

However, if we require the games to be strict, subgame stability turns out 
to be somewhat useful in 3-player games: 

Theorem 4.2. In any strict 2 x 2 x M or 2 x 3 x M game, subgame stability 
implies weak acyclicity, even under best response. 
Figure 4: A non-strict 2x2x2 subgame-stable game with a non-trivial sink

We will first need a couple of technical lemmas: 

Lemma 4.3. In strict games, neither a pure Nash equilibrium nor strategy 
profiles differing from it in only one player's action can be part of a non-trivial 
sink of the best-response dynamics. 

Proof. A pure Nash equilibrium always forms a 1-node sink. If the game is 
strict, profiles differing by one player's action have to give that one player a 
strictly lower payoff, requiring a best-response transition to the equilibrium's 
sink. Any node connected to either cannot be in a sink. □ 

Lemma 4.4. The profiles of a game that constitute a non-trivial sink of the 
best-response dynamics cannot be all contained within a subgame that is weakly 
acyclic under best-response. 

Proof. Consider such a non-trivial sink of game $\Gamma$ contained in such a subgame
$\Gamma'$. Take a profile s in the sink, and consider the path $P = \{s = s^0, s^1, \ldots, s^k\}$ of
$\Gamma'$ best responses that leads to $s^k$, an equilibrium of $\Gamma'$. This path is guaranteed
to exist because $\Gamma'$ is weakly acyclic under best response. Consider the last
profile in P, $s^a$, such that all profiles on P between s and $s^a$ are in the sink.

If $s^a = s^k$ (i.e., if P is entirely in the sink), there has to be a best-response
transition in $\Gamma$ from $s^k$ to some $s'$, because $s^k$ cannot be an equilibrium of $\Gamma$
and be in a non-trivial sink. If $s'$ were in $\Gamma'$, the transition from $s^k$ to $s'$ would
have been a best response in $\Gamma'$, too, contradicting $s^k$ being an equilibrium of
$\Gamma'$; thus, $s'$ is not in $\Gamma'$, but is in the sink.

If $s^a \neq s^k$, the transition from $s^a$ to $s^{a+1}$, by some player i, is a best response
in $\Gamma'$, but not in $\Gamma$. So $s^a_i$ is not i's best response to $s^a_{-i}$, and thus there is a best
response by i from $s^a$ to some $s'$ in $\Gamma$. Because $s^a$ to $s^{a+1}$ is not a best-response
transition by i in $\Gamma$, $u_i(s') > u_i(s^{a+1})$, and because $s^{a+1}$ is a best response in $\Gamma'$,
$s'$ must not be in $\Gamma'$; but because $s^a$ is in the sink, so is $s' \notin \Gamma'$. □

We now start with the corner cases of 3-player, 2x2x2 games, and 2-player,
2 x m games, where weak acyclicity requires even less than subgame stability.
The former result forms the base case for Theorem 4.2, and both might also be
of independent interest.

Lemma 4.5. In any 2 x m game, if there is a pure Nash equilibrium, then the 
game is weakly acyclic, even under best response. 

Proof. In general 2 x m games with pure Nash equilibrium $(s^*, t^*)$, a non-trivial
best-response sink cannot consist of moves by just one player. Thus, the first
player will play both of his strategies somewhere in such a sink, including $s^*$,
so there is some $(s^*, t')$ state in the sink. If $t'$ is not a best response to $s^*$, there
would be a best-response transition to the equilibrium $(s^*, t^*)$, which couldn't
happen in a sink. If $t'$ is a best response to $s^*$, and $s^*$ is a best response to $t'$,
then $(s^*, t')$ is a Nash equilibrium, which couldn't happen in a sink. Lastly, if $t'$
is a best response to $s^*$, but $s^*$ is not a best response to $t'$, there has to be some
inbound best-response transition into $(s^*, t')$ from another profile in the sink,
and that transition then has to involve a move by player 2, from some other
state $(s^*, t'')$, guaranteeing that $t''$ is not a best response to $s^*$. Because $t^*$ has
to also be a best response to $s^*$, there is then a best-response transition from
$(s^*, t'')$ to the equilibrium $(s^*, t^*)$, concluding the proof. □
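
As an informal cross-check of Lemma 4.5, one can sample 2 x m games, keep those with at least one pure Nash equilibrium, and search the best-response dynamics graph exhaustively. The sketch below does this for m = 4 with payoffs drawn from {0, 1, 2}, so that ties are common and the non-strict case is exercised; the sampling scheme, the seed, and the names are our assumptions.

```python
import random
from itertools import product

def deviations(profile, i, options):
    return [profile[:i] + (s,) + profile[i + 1:] for s in options]

def is_equilibrium(payoffs, strategies, profile):
    return all(payoffs[profile][i] >= payoffs[q][i]
               for i, opts in enumerate(strategies)
               for q in deviations(profile, i, opts))

def best_response_moves(payoffs, strategies, profile):
    """Moves to any payoff-maximizing strategy that strictly improves on the current one."""
    moves = []
    for i, opts in enumerate(strategies):
        top = max(payoffs[q][i] for q in deviations(profile, i, opts))
        if top > payoffs[profile][i]:
            moves += [q for q in deviations(profile, i, opts) if payoffs[q][i] == top]
    return moves

def reaches_equilibrium(payoffs, strategies, start):
    seen, stack = {start}, [start]
    while stack:
        for nxt in best_response_moves(payoffs, strategies, stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return any(is_equilibrium(payoffs, strategies, p) for p in seen)

rng = random.Random(7)
strategies = [(0, 1), (0, 1, 2, 3)]          # a 2 x 4 game
profiles = list(product(*strategies))
with_equilibrium = failures = 0
for _ in range(1000):
    # Payoffs in {0, 1, 2}, so ties (non-strict games) occur frequently.
    payoffs = {p: (rng.randrange(3), rng.randrange(3)) for p in profiles}
    if any(is_equilibrium(payoffs, strategies, p) for p in profiles):
        with_equilibrium += 1
        failures += not all(reaches_equilibrium(payoffs, strategies, p)
                            for p in profiles)
print(with_equilibrium, failures)
```

Consistent with the lemma, no sampled game with an equilibrium should fail the reachability check.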

Lemma 4.6. In any strict 2x2x2 game, if there is a pure Nash equilibrium, 
the game is weakly acyclic, even under best response. 

Proof. In strict 2x2x2 games, Lemma 4.3 leaves 4 other strategy profiles,
with the possible best-response transitions forming a star in the underlying
undirected graph. Because best-response links are antisymmetric ($s \to s'$ and
$s' \to s$ cannot both be best-response moves if they differ only in a single player's
action), there can be no cycle among those 4 profiles, and thus no non-trivial
sink components. □

Proof of Theorem 4.2. We treat the 2 x 2 x M and 2 x 3 x M cases separately.
The 2 x 2 x M case: With Lemma 4.6 as the base case, assume, inductively, that
the 2 x 2 x M claim holds for all values of M through some $M' - 1$, and suppose
some 2 x 2 x M' game $\Gamma$, with strategy sets $\{a_{0,1}\}$ (i.e., the set containing the
strategies $a_0$ and $a_1$), $\{b_{0,1}\}$, and $\{c_{0,\ldots,M'-1}\}$, has a non-trivial best-response
sink X. Without loss of generality, let $(a_0, b_0, c_0)$ be an equilibrium of $\Gamma$.

Lemma 4.4 guarantees that X is not contained in the subgame $\Gamma_{-c_0}$, where
only strategy $c_0$ is removed, leaving a strict, subgame stable 2 x 2 x (M' - 1) game,
which is weakly acyclic under best response by the inductive hypothesis. But
the only profile using $c_0$ that is allowed to be in X after applying Lemma 4.3 is
$(a_1, b_1, c_0)$, from which the same lemma guarantees that only player 3 can make
a best-response transition in X. Thus, it can have no inbound best-response
transitions by player 3, leaving no way for it to be reached from the rest of X,
which can thus not be a sink.

The 2 x 3 x M case: The 2x3x2 case is isomorphic to the above. With that
as the base case, assume, inductively, that the 2 x 3 x M claim holds for all
values of M up to some $M' - 1$, and suppose that a 2 x 3 x M' game $\Gamma$, with
strategy sets $\{a_{0,1}\}$, $\{b_{0,1,2}\}$, and $\{c_{0,\ldots,M'-1}\}$, has a non-trivial sink X. Without
loss of generality, let $(a_0, b_0, c_0)$ be a pure Nash equilibrium of $\Gamma$. The inductive
hypothesis and Lemma 4.4 guarantee that X spans $\Gamma$, and, in particular, that
it has at least one node of form $(*, *, c_0)$, and at least one of form $(*, b_0, *)$.

By Lemma 4.3, the $(*, *, c_0)$ node has to be one of the two nodes $(a_1, b_{1,2}, c_0)$,
and that node cannot have an outbound best response by player 1. To be in a
non-trivial sink, it has to have an inbound and an outbound best response, one
of which is thus by player 3, and the other by player 2, ensuring that both of
the two nodes $(a_1, b_{1,2}, c_0)$ are in X. One of those will then be player 2's best
response to $(a_1, c_0)$; without loss of generality, let that one be $(a_1, b_1, c_0)$. Then,
the only inbound best response to lead to $(a_1, b_2, c_0)$ is by player 3, and player
3 has to have an outbound best response from $(a_1, b_1, c_0)$ to some $(a_1, b_1, c_x)$.

From $(a_1, b_1, c_x)$, if there is an outbound best response by player 2, it cannot
be to $(a_1, b_2, c_x)$: otherwise, the subgame with strategies $\{a_1\}$, $\{b_{1,2}\}$, and $\{c_{0,x}\}$
is isomorphic to Matching Pennies. Player 2's best response would thus have to
instead be to $(a_1, b_0, c_x)$. From there, player 1 cannot have an outbound best
response by Lemma 4.3, thus requiring player 3 to have a best response to some
$(a_1, b_0, c_y)$; from there, too, player 1 cannot have a best response by Lemma
4.3, requiring a best response by player 2. But then, in the 1 x 3 x M' subgame
formed by removing strategy $a_0$, for each of player 2's strategies, player 3's best
response is to a profile that has an outbound best response by player 2, which
precludes an equilibrium.

Thus, from $(a_1, b_1, c_x)$, the sole possible outbound best response is by player
1, to $(a_0, b_1, c_x)$.

Consider now the 2 x 2 x M' subgame $\Gamma_{-b_0}$ formed by taking away strategy
$b_0$, and let $s^*$ be its pure Nash equilibrium. If $s^*$ is of form $(a_0, b_{1,2}, c_0)$, that
would require that it be player 1's best response in $\Gamma$ to $(b_{1,2}, c_0)$, thus putting $s^*$
in the sink, in violation of Lemma 4.3. The pure Nash equilibrium $s^*$ also cannot
be of form $(a_1, b_{1,2}, c_0)$: otherwise, it is in the sink, and yet the only outbound
best responses in $\Gamma$ must be those not in $\Gamma'$, i.e., by player 2 to $(a_1, b_0, c_0)$, in
violation of Lemma 4.3. Thus, $s^*$ has player 3 playing a strategy other than
$c_0$, which is its best response to one of $(a_1, b_{1,2})$. Player 3's best response to
$(a_1, b_2)$ is $c_0$, so that cannot be $s^*$. Player 3's best response to $(a_1, b_1)$ is $c_x$, but
there is an outbound best response by player 1 to $(a_0, b_1, c_x)$ from there, which
is within $\Gamma_{-b_0}$. Thus, $s^* = (a_0, b_y, c_z)$, for $y \in \{1, 2\}$.

Then, s* again cannot be in X, because the sole outbound transition in Γ
could only be to (a_0, b_0, *), violating Lemma 4.3. Neither (a_1, b_y, c_z) nor any
(a_0, b_y, *) profile can be in X, either, because their best-response transition to
s* in Γ' would also be a best response in Γ, putting s* into X. If the one
(a_0, b_v, c_z) profile with 0 ≠ v ≠ y were in X, it has a best response to s* in Γ',
so it has to have a best response by player 2 in Γ; but it cannot be to
(a_0, b_0, c_z) by Lemma 4.3, and cannot be to s* because s* cannot be in X. Thus,
much like in Lemma 2, no profile differing in at most one player's strategy from s*
can be in X, either.

We can now show that nodes (a_1, b_{0,v}, c_z) are both in X, by an argument
symmetrical to that for (a_1, b_{1,2}, c_0). The same argument will yield that either
b_v or b_0 is the best response to (a_1, c_z), and that the other one of the two is a
best response by player 3. We finish by analyzing the Cartesian product of those
two cases with the two possible values v ∈ {1, 2}:

Case: v = 1, b_v is best response. The above argument will require that
the best response to (a_1, b_1) by player 3, c_x, be neither c_0 nor c_z. If the
outbound best response from (a_1, b_1, c_x) is by player 2, to b_0 or b_2, then either
{a_1} × {b_{0,1}} × {c_{z,x}} or {a_1} × {b_{1,2}} × {c_{0,x}}, respectively, forms a
subgame isomorphic to Matching Pennies. On the other hand, suppose the outbound best
response from (a_1, b_1, c_x) is by player 1, to (a_0, b_1, c_x). Since (a_0, b_0, *)
nodes and (a_0, b_2, *) nodes may not be in X, the only outbound response from there
is by player 3, to some c_w, from which the only outbound best response is by
player 1 to (a_1, b_1, c_w), creating a Matching Pennies subgame with strategies
{a_{0,1}} × {b_1} × {c_{w,x}}.

Case: v = 1, b_0 is best response. In this case, c_z has to be the best response
to (a_1, b_1) by player 3, requiring that c_x = c_z, but it was established above that
(a_1, b_1, c_x) cannot have an outbound best response by player 2.

Case: v = 2, b_v is best response. An argument symmetric to the v = 1,
b_0 case will show that c_0, the requisite best response to (a_1, b_1), cannot have
an outbound best response by player 2.

Case: v = 2, b_0 is best response. This gives a contradiction, because it
would require both c_z and c_0 to be the best response to (a_1, b_2).

Thus, Γ cannot have a non-trivial best-response sink. □
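
The case analysis above repeatedly rules out a configuration by exhibiting a subgame isomorphic to Matching Pennies. As a minimal sketch of why that is a contradiction, the snippet below encodes Matching Pennies with the conventional +1/-1 payoffs (an assumption made for illustration; the names `utility` and `best_response_moves` are hypothetical) and confirms that it has no pure Nash equilibrium and that its best-response moves form a single 4-cycle.

```python
from itertools import product

# Matching Pennies with the conventional +1/-1 payoffs (assumed for illustration).
# Player 0 wants the two choices to match; player 1 wants them to differ.
def utility(player, profile):
    match = profile[0] == profile[1]
    if player == 0:
        return 1 if match else -1
    return -1 if match else 1

def best_response_moves(profile):
    """Profiles reached when one player switches to a strictly better best response."""
    moves = []
    for player in (0, 1):
        options = [profile[:player] + (s,) + profile[player + 1:] for s in (0, 1)]
        best = max(options, key=lambda p: utility(player, p))
        if utility(player, best) > utility(player, profile):
            moves.append(best)
    return moves

profiles = list(product((0, 1), repeat=2))

# A pure Nash equilibrium is a profile with no outbound best-response move.
pure_nash = [p for p in profiles if not best_response_moves(p)]

# Follow the (unique) outbound move from (0, 0) until a profile repeats.
cycle, state = [], (0, 0)
while state not in cycle:
    cycle.append(state)
    state = best_response_moves(state)[0]

print(pure_nash)
print(cycle)
```

Since every profile has exactly one outbound move and none is an equilibrium, any subgame with this structure violates the assumption that every subgame has a pure Nash equilibrium.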

Theorem 4.2 is maximal. All bigger sizes of 3-player games admit subgame-stable
examples that are not weakly acyclic:

Theorem 4.7. In non-degenerate⁴ strict 3-player games, the existence of pure
Nash equilibria in every subgame is insufficient to guarantee weak acyclicity, for
any game with at least 3 strategies for each player, and any game with at least
4 strategies for 2 of the players.

Proof. The first half of the theorem follows directly from a specific example
game in [18]. There, the strict 3-player, 3 × 3 × 3 game in question is stated
to demonstrate that SSS does not imply weak acyclicity under best response.
In fact, that same example is not even weakly acyclic under better response.
Here, we examine a 2 × 4 × 4 example to establish the second half of the theorem,
and a 3 × 3 × 3 example that is slightly cleaner than the one in [18], both shown
in Figure 5.

In each of these three-player games, there is a pure Nash equilibrium in
the full game, s* = (a_1, b_1, c_1) in Γ_{333}, and s* = (a_3, b_3, c_1) in Γ_{442}, with
utility 5 for each of the players. In both, there is a cycle C, every profile in 
which differs from s* in at least 2 players' strategies. Any profile (a_i, b_j, c_k)
that is neither s* nor in C yields utilities (i, j, k). With utilities in C always in
{4, 5}, there is never an incentive for anyone to unilaterally leave the cycle C:
the remaining profiles form a "sheath" of low-utility states separating C from the
rest of the game, particularly s*. Thus C is a persistent cycle. By construction, the
game is strict and at each state in C there is a unique player who has a better
response to the current state.

Consider any subgame Γ' of either game. If Γ' contains s*, s* is a pure Nash
equilibrium of Γ' as well.

Suppose Γ' is not the full game. In the course of cycling through C, each
strategy of each player is used at least once. Thus, Γ' cannot contain all of
C. If it has at least some states of C, pick one state that is in Γ', and follow
the edges of C until you get to a state whose sole outbound better-response
move has been "broken" by the better-response strategy being removed in Γ'.
This process will terminate, because C is a simple cycle in Γ that had at least
one node missing in Γ'. The sole player that had an incentive to move in that
state in Γ now no longer has that option, and if he has any other strategy, the
resulting state cannot be in C, because C never uses more than 2 strategies of
any player i in combination with any fixed s_{-i}. Thus, any other strategy is not
an improvement for that player, either, and this new state is thus a pure Nash
equilibrium in Γ'.

⁴ Each player has 2 or more strategies.

Figure 5: 3-player strict subgame stable games that are not weakly acyclic, even
under better-response dynamics

Lastly, if Γ' contains neither s* nor any nodes of C, taking the highest-index
strategy for each player yields a profile that has to be a pure Nash equilibrium,
because the utilities of non-C, non-s* profiles are just (i, j, k).

Thus, every subgame is guaranteed to have a pure Nash equilibrium, and,
due to C, neither game is weakly acyclic. The theorem holds for games with
more strategies by padding the examples given above. □
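
The two properties contrasted in this proof (a pure Nash equilibrium in every subgame versus weak acyclicity) can both be checked by brute force on small normal-form games. The following is a minimal sketch under stated assumptions: a game is given as a list of strategy sets plus a function `u` mapping a profile to a tuple of utilities, a subgame is any choice of a non-empty strategy subset per player, and all function names are hypothetical. The payoff tables of Figure 5 are not reproduced in the text, so the sketch is exercised on two toy 2 × 2 games instead.

```python
from itertools import chain, combinations, product

def nonempty_subsets(items):
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(1, len(items) + 1))

def subgames(strategy_sets):
    """Every subgame, as one non-empty strategy subset per player."""
    return product(*(nonempty_subsets(s) for s in strategy_sets))

def better_responses(u, sets, profile):
    """Profiles reached by one player's strictly improving unilateral move."""
    out = []
    for i, options in enumerate(sets):
        for s in options:
            q = profile[:i] + (s,) + profile[i + 1:]
            if u(q)[i] > u(profile)[i]:
                out.append(q)
    return out

def pure_nash(u, sets):
    return [p for p in product(*sets) if not better_responses(u, sets, p)]

def is_subgame_stable(u, sets):
    """True if every subgame has at least one pure Nash equilibrium."""
    return all(pure_nash(u, sub) for sub in subgames(sets))

def is_weakly_acyclic(u, sets):
    """True if every profile has a better-response path to a pure Nash equilibrium."""
    profiles = list(product(*sets))
    good = set(pure_nash(u, sets))
    changed = True
    while changed:  # grow the set of profiles that can reach an equilibrium
        changed = False
        for p in profiles:
            if p not in good and any(q in good for q in better_responses(u, sets, p)):
                good.add(p)
                changed = True
    return len(good) == len(profiles)

# Toy games (assumed payoffs, for illustration only).
strategy_sets = [(0, 1), (0, 1)]

def pennies(p):
    return (1, -1) if p[0] == p[1] else (-1, 1)

def coordination(p):
    return (1, 1) if p[0] == p[1] else (0, 0)

for name, game in (("coordination", coordination), ("pennies", pennies)):
    print(name, is_subgame_stable(game, strategy_sets), is_weakly_acyclic(game, strategy_sets))
```

On these inputs the coordination game satisfies both properties and Matching Pennies satisfies neither. The games of Theorem 4.7 are exactly those where the first check would succeed and the second would fail.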

With 4 or more players, a more mechanistic approach produces analogous 
examples even with just 2 strategies per player: 

Theorem 4.8. In a strict n-player game for an arbitrary n ≥ 4, the existence
of pure Nash equilibria in every subgame is insufficient to guarantee weak
acyclicity, even with only 2 strategies per player.

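Proof. For strategy profiles in {0, 1}^n, using indices mod n, set the utilities to:

\[
u(s) =
\begin{cases}
(4, \ldots, 4) & \text{at } s = (1, \ldots, 1), \\
(3, \ldots, 3, 2, \underbrace{3}_{i\text{'th}}, 3, \ldots, 3) & \text{when } s_{i-1} = s_i = 1,\ s_{-\{i-1,i\}} = \mathbf{0}, \\
(3, \ldots, 3, \underbrace{2}_{(i+1)\text{'th}}, 3, \ldots, 3) & \text{when } s_i = 1,\ s_{-i} = \mathbf{0}, \\
\ldots & \text{else (for the ``sheath'').}
\end{cases}
\]
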
Similarly to Theorem 4.7, this plants a global pure Nash equilibrium at (1, ..., 1),
and creates a "fragile" better-response cycle. Here, the cycle alternates between
profiles with edit distance n − 1 and n − 2 from the global pure Nash equilibrium.
At every point of the cycle, the only non-sheath profiles 1 step away are its
predecessor and successor on the cycle, so the cycle is persistent. Because each
profile with edit distance n − 1 from the equilibrium is covered, removing any
player's 1 strategy breaks the cycle, thus guaranteeing a pure Nash equilibrium
in every subgame by the same reasoning as above. □
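
The structural claims in this proof can be checked mechanically without reference to the payoffs. The sketch below is an illustration under assumptions: it takes the cycle to consist of the profiles with a single 1 and the profiles with two cyclically adjacent 1s, listed alternately, and the names `cycle_profiles` and `has_fragile_cycle_structure` are hypothetical. It verifies that distances from (1, ..., 1) alternate between n − 1 and n − 2, that every profile at distance n − 1 lies on the cycle, and that the only non-sheath neighbors of a cycle profile are its predecessor and successor.

```python
from itertools import product

def cycle_profiles(n):
    """Cycle states in cyclic order: a single 1 at i, then 1s at i and i+1 (mod n)."""
    states = []
    for i in range(n):
        single = tuple(1 if j == i else 0 for j in range(n))
        pair = tuple(1 if j in (i, (i + 1) % n) else 0 for j in range(n))
        states += [single, pair]
    return states

def hamming(p, q):
    return sum(a != b for a, b in zip(p, q))

def neighbors(p):
    """Profiles differing from p in exactly one player's strategy."""
    return [p[:i] + (1 - p[i],) + p[i + 1:] for i in range(len(p))]

def has_fragile_cycle_structure(n):
    cycle = cycle_profiles(n)
    equilibrium = (1,) * n
    non_sheath = set(cycle) | {equilibrium}
    # Every profile at distance n - 1 from the equilibrium must be on the cycle.
    far = {p for p in product((0, 1), repeat=n) if hamming(p, equilibrium) == n - 1}
    if not far <= set(cycle):
        return False
    for k, state in enumerate(cycle):
        pred, succ = cycle[k - 1], cycle[(k + 1) % len(cycle)]
        expected = n - 1 if k % 2 == 0 else n - 2
        if hamming(state, equilibrium) != expected:
            return False
        # The only non-sheath profiles one step away are pred and succ.
        if {q for q in neighbors(state) if q in non_sheath} != {pred, succ}:
            return False
    return True

for n in (3, 4, 5, 6):
    print(n, has_fragile_cycle_structure(n))
```

The check fails for n = 3, where a profile with two 1s is a single step from (1, 1, 1), and succeeds for the larger values tried here, which matches the theorem's restriction to at least 4 players.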

We note that the fixed-size examples that demonstrate the negative results
above (in Theorems 4.1, 4.7, and 4.8) easily extend to games with extra strategies
for some or all players, or with extra players, by "padding" the added part of
the payoff table with negative, unique values that, for the added profiles, make
payoffs independent of the other players, e.g., u_i(s) = −s_i. This preserves
SS, SSS, and USS properties without changing weak acyclicity. Thus, this
completes our classification of weak acyclicity under the three subgame-based 
properties, as shown in Table Q] 

5 Concluding remarks 

The connection between weak acyclicity and unique subgame stability that we 
present is surprising, but not immediately practicable: in most succinct game 
representations, there is no reason to believe that checking unique subgame 
stability will be tractable in many general settings. In a complexity-theoretic 
sense, USS is closer to tractability than weak acyclicity: Any reasonable game 
representation will have some "reasonable" representation of subgames, i.e., one 
in which checking whether a state is a pure Nash equilibrium is tractable, which 
puts unique subgame stability in a substantially easier complexity class, Π_3^P,
than the class PSPACE for which weak acyclicity is complete in many games.
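
To give a rough sense of why naive verification is impractical even for explicitly given games, the sketch below counts subgames, assuming a subgame is any choice of a non-empty strategy subset for each player (so the count includes the full game and all single-profile subgames). The function name `count_subgames` is hypothetical, and the formula is elementary counting rather than a result from this paper.

```python
from math import prod

def count_subgames(strategy_counts):
    """Number of subgames when player i has strategy_counts[i] strategies.

    Each player independently keeps any non-empty subset of strategies,
    giving 2**m - 1 choices for a player with m strategies.
    """
    return prod(2 ** m - 1 for m in strategy_counts)

print(count_subgames([3, 3, 3]))   # the 3 x 3 x 3 size from Theorem 4.7
print(count_subgames([4, 4, 2]))   # the 2 x 4 x 4 size from Theorem 4.7
print(count_subgames([2] * 10))    # ten players with 2 strategies each
```

The count grows exponentially in the total number of strategies, so any practical test for unique subgame stability would need to exploit structure in the game rather than enumerate subgames.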

We leave open the important question of finding efficient algorithms for 
checking unique subgame stability, which may well be feasible in particular 
classes of games. Also open and relevant, of course, is the question of more 
broadly applicable and tractable conditions for weak acyclicity. In particular, 
there may well be other levels of the subgame stability hierarchy between SSS 
and USS that could give us weak acyclicity in broader classes of games. 